Empowering agricultural research: A comprehensive custard apple (Annona squamosa) disease dataset for precise detection

The custard apple, also known as sugar apple or sweetsop, is grown in diverse regions including India, Portugal, Thailand, Cuba, and the West Indies. This dataset holds 8226 images of custard apple (Annona squamosa) fruit and leaf diseases, categorized into six types: Anthracnose, Blank Canker, Diplodia Rot, Leaf Spot on fruit, Leaf Spot on leaf, and Mealy Bug. It is a resource for training and testing machine learning algorithms that detect and classify diseases in custard apple plants, using methods such as deep learning, feature extraction, and pattern recognition. Public access fosters collaboration, which can speed up advances in plant pathology and support the sustainability of custard apple cultivation. The dataset is intended to improve disease identification, monitoring, and management in custard apple production, with the aim of raising crop yields and refining farming practices.


Background
Custard apple, also known as sugar apple, is a subtropical fruit in the Annonaceae family, recognized by its tough green skin, creamy interior, and variable shape. Its sweet, granular flesh contains numerous seeds and is rich in fiber, minerals, and vitamins, offering health benefits. The fiber aids digestion, prevents constipation, and supports detoxification. Packed with antioxidants such as flavonoids, phenolic compounds, kaurenoic acid, and vitamin C, custard apple fights free radicals associated with chronic diseases, cancer, and heart conditions. Lutein and carotenoid antioxidants protect the eyes from oxidative damage, reducing the risk of age-related macular degeneration, vision loss, and cataracts.
The development of the Custard Apple Leaf and Fruit Disease Image Dataset was motivated by the fruit's medicinal properties and its economic significance in the food and pulp-making industry. Given its medicinal applications and substantial market value, industries demand high-quality custard apples. The health and quality of the fruit and leaves are pivotal for fruit-based food industries. However, diseases can impede fruit production, so identifying them is crucial. To address this need, we built a disease dataset for both fruit and leaf, encompassing several types of disease. This initiative serves food industries seeking superior fruit quality and also benefits farmers by providing a tool for evaluating quality in their cultivation practices.

Data Description
Image datasets play a crucial role in various fields, ranging from computer vision and machine learning to medical research and social sciences. These datasets provide a rich source of visual information that enables researchers, developers, and professionals to train and validate their models, algorithms, and theories. By having access to diverse and well-curated image datasets, researchers can explore new possibilities, enhance the accuracy and robustness of their models, and gain valuable insights into patterns, trends, and relationships within the visual data.
An image dataset specific to custard apple fruit and leaf diseases holds significant importance in the agricultural domain. Such datasets provide researchers, agronomists, and farmers with a valuable resource to identify, classify, and study various fruit and leaf diseases. By analysing these images, experts can develop more accurate disease detection algorithms and early warning systems. This aids in prompt disease management, preventing widespread crop damage and yield loss. Additionally, a comprehensive dataset allows for the exploration of disease patterns, environmental factors, and potential mitigation strategies. In summary, a custard apple fruit and leaf disease image dataset plays a pivotal role in advancing research, improving crop management practices, and ensuring the overall health and productivity of custard apple.
This Custard Apple fruit and leaf dataset [1] contains a diverse collection of 8226 high-resolution images. The images are stored in JPEG format and have dimensions of 768 × 1024 pixels. The dataset is categorized into six distinct classes: one leaf disease category and five fruit disease categories. The fruit categories cover common custard apple fruit diseases, namely Anthracnose, Blank Canker, Diplodia Rot, Leaf Spot on fruit, and Mealy Bug, and the leaf category covers the common leaf disease Leaf Spot (Table 1). Each category is labelled and organized in a separate folder, ensuring easy access and identification of specific disease samples. Fig. 1 shows the directory structure of the custard apple disease dataset.
The images were collected through extensive field surveys conducted in custard apple growing regions. The data collection process involved using quality cameras to capture images from various angles, including both sides of the fruit and leaves. Images were taken in the field and after cutting or separating individual fruit and leaves, capturing different stages and manifestations of the diseases. This approach is intended to give a comprehensive representation of the visual characteristics of custard apple diseases within the dataset. The images have a resolution of 72 dots per inch (dpi). Table 1 lists the dataset categories and the image count per category, and Table 2 shows exemplar images from the dataset.
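Because each disease category sits in its own folder, a per-class image count can be reproduced with a few lines of standard-library Python. The sketch below is illustrative only: it assumes a root directory containing one sub-folder per class and images with a `.jpg` or `.jpeg` extension, and the function name `count_images_per_class` is hypothetical rather than part of the dataset.

```python
from pathlib import Path

# Assumption: one sub-folder per disease class, images stored as .jpg/.jpeg.
IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def count_images_per_class(root):
    """Return {class_folder_name: number_of_JPEG_files} for a dataset root."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # ignore loose files next to the class folders
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

If the downloaded copy follows the folder layout described above, summing the returned values should give the 8226 images reported for the dataset; a different total would suggest an incomplete download or a different folder nesting.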
Prior research has extensively explored disease detection in fruit crops, focusing on image-based classification and machine learning algorithms [2][3][4][5][6][7][8][9][10]. This dataset's main goal is to offer a thorough selection of high-resolution images of custard apple fruit and leaves that cover a range of diseases. The dataset is intended to aid in the development and assessment of machine learning methods and algorithms for custard apple disease detection and classification. The open availability of this dataset encourages the scientific community to collaborate and share information. Ultimately, the dataset is meant to help researchers and practitioners build efficient detection and classification algorithms, enhance disease management strategies, and support the sustainable production of custard apples.

Experimental design
The Custard Apple Disease dataset was generated by acquiring images with the high-resolution rear cameras of a Samsung Galaxy F23 5G mobile phone. Fig. 2 summarizes the data acquisition steps undertaken for the project.
Step 1: Image Acquisition Phase (Duration: August to September): In this phase, we performed field visits in daylight to photograph the different diseases affecting custard apple fruits and leaves, aiming to compile a thorough collection of images of these diseases.
Step 2: Image Preprocessing Phase (Duration: October): During this phase, the collected images were reviewed, and those suitable for the dataset were chosen. The selected images were then pre-processed, which could involve resizing, cropping, and enhancing as required.
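The article does not state how cropping and resizing were combined. As one possible illustration, the sketch below computes the largest centered crop with a 768:1024 (3:4) aspect ratio, so that a subsequent resize to 768 × 1024 pixels does not distort the image. The function name `center_crop_box` and the centered-crop strategy are assumptions made for this example, not the authors' documented procedure.

```python
def center_crop_box(width, height, target_w=768, target_h=1024):
    """Return (left, top, right, bottom) of the largest centered region
    whose aspect ratio matches target_w:target_h.

    Assumption: a centered crop is acceptable; the source does not say
    how images were actually cropped.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Compare aspect ratios by cross-multiplication to stay in integers.
    if width * target_h > height * target_w:
        # Image is too wide: keep the full height and trim the sides.
        crop_w = height * target_w // target_h
        crop_h = height
    else:
        # Image is too tall (or already matches): keep the full width.
        crop_w = width
        crop_h = width * target_h // target_w
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)
```

The returned tuple uses the (left, upper, right, lower) order that image libraries such as Pillow accept for cropping, so the box could be passed to a crop call and followed by a resize to the target dimensions.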
Step 3: Image Classification (Duration: November): Image classification is an essential process for creating a separate dataset for each disease. It includes batch conversion, in which each image is labelled with a specific sequence number. Fig. 3 shows the stepwise classification of images. Table 3 shows the image classification parameters for the dataset images. Image classification proceeded in three sub-steps:
Sub-step 1: Create a dataset for each disease and place all images of that disease in it.
Sub-step 2: Use the image processing and classification software IrfanView (64-bit, version 4.62) for batch conversion of the disease-specific images. Batch conversion is a very important process for the classification of images.
Sub-step 3: Configure the images with specific parameters, which is an important step for batch conversion.
The parameters considered in batch conversion are listed in Table 3.
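The authors performed batch conversion and sequence numbering in IrfanView. For readers who prefer a scripted equivalent, the sketch below shows how sequence-numbered file names could be assigned within one disease folder using only the Python standard library. It is a stand-in for the IrfanView workflow, not a reproduction of it: the naming pattern `<prefix>_<sequence>.jpg`, the four-digit padding, and both function names are assumptions chosen for illustration.

```python
from pathlib import Path


def plan_sequential_names(folder, prefix, digits=4):
    """Return (old_path, new_path) pairs that rename every JPEG in `folder`
    to '<prefix>_<sequence>.jpg', ordered by the current file name.

    Assumption: the naming pattern is hypothetical; the source only says
    that images are labelled with a specific sequence number.
    """
    folder = Path(folder)
    images = sorted(
        p
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in {".jpg", ".jpeg"}
    )
    return [
        (old, folder / f"{prefix}_{index:0{digits}d}.jpg")
        for index, old in enumerate(images, start=1)
    ]


def apply_renames(plan):
    """Apply a rename plan, refusing to overwrite a different existing file."""
    for old, new in plan:
        if new.exists() and new != old:
            raise FileExistsError(new)
        old.rename(new)
```

Separating the plan from its application lets the proposed names be inspected before any file is touched, which is useful when the folder already holds curated images.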

Data collection phases
Fig. 3 illustrates the sequential progression of steps involved in curating the dataset, from field identification to dataset compilation. It delineates the six key stages involved in the dataset collection process:
6. Image Storage in Dataset: The classified images are then stored systematically within the dataset for further analysis and research purposes.

Materials or specification of image acquisition system
The camera used in the data acquisition process and the specifications of the captured images are as follows:
• Make and Model: Samsung Galaxy F23 5G (SM-E236B) Android Mobile.
During the data collection process, efforts were made to adhere to standardized image acquisition practices, capturing each image with the rear cameras of a Samsung Galaxy F23 5G mobile phone, which is known for its high-resolution imaging capabilities. This maintained consistency and quality throughout the dataset. The captured images were saved in JPG format and resized to a resolution of 768 × 1024 pixels. To reduce errors and improve dependability, the multi-step labeling procedure involved careful examination and cross-validation by several experts in the field. The validation of disease categories by subject matter experts adds credibility to the dataset and helps ensure that each disease's visual attributes are accurately represented.

Methods
The dataset was collected through field visits to sugar apple farms. The authors collaborated with farmers to examine and capture high-resolution images of sugar apple fruit. These images were classified in detail by disease presence and other attributes, forming a systematically organized dataset for analysis and research. To gather the images for the custard apple disease dataset, a farm in Nimgaon Bhogi, Taluka-Shirur, District-Pune, India was visited. Images were taken in a variety of settings, with fruits and leaves both in their natural state and after they were cut off or separated from the plant. This made it possible to depict the variations of custard apple diseases thoroughly.

Limitations
The dataset is collected from a specific region, potentially limiting its applicability to other geographical areas with different disease prevalence or manifestations.

Ethics Statement
Our study does not involve animals or humans. We therefore confirm that our research strictly adheres to the guidelines for authors provided by Data in Brief in terms of ethical considerations.

© 2024 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Specifications Table
To ensure the dataset's relevance and diversity, images were collected from various custard apple plantations in the Nimgaon-Bhogi region, considering different growth stages, environmental conditions, and disease manifestations. The data collection process encompassed several stages to ensure a comprehensive representation of custard apple fruit and leaf samples. The data contain images of different diseases, including Anthracnose, Blank Canker, Diplodia Rot, Leaf Spot on fruit, Leaf Spot on leaf, and Mealy Bug. The captured images were saved in JPG format and resized to a resolution of 1024 × 768 pixels. Extensive field surveys were conducted to gather a diverse range of fruit and leaves affected by various diseases. High-resolution images were captured using quality cameras from multiple angles, including both sides of the leaves and fruit, to give a holistic view of their condition.

1. Value of the Data
With 8226 images, researchers can develop and evaluate models using deep learning, feature extraction, and pattern recognition, enhancing disease detection accuracy. In the scientific field of plant pathology, it promotes collaborations and novel approaches. This Custard Apple dataset is a valuable resource with applications in research, agriculture, technology development, and disease detection.
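Developing and evaluating models on the dataset requires separating the images into training, validation, and test subsets. The article does not prescribe a split, so the following is only a minimal sketch of a per-class (stratified) split using the standard library. The 70/15/15 default proportions, the fixed seed, and the function name `stratified_split` are assumptions for illustration.

```python
import random


def stratified_split(items_by_class, train=0.7, val=0.15, seed=0):
    """Split each class's items into train/val/test lists with the same
    proportions, so every disease class appears in every subset.

    Assumption: the proportions are illustrative; the source defines no split.
    """
    if not (0 < train < 1 and 0 <= val < 1 and train + val < 1):
        raise ValueError("fractions must leave a non-empty test share")
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    splits = {"train": [], "val": [], "test": []}
    for label in sorted(items_by_class):
        items = sorted(items_by_class[label])
        rng.shuffle(items)
        # Counts are truncated; any remainder goes to the test subset.
        n_train = int(len(items) * train)
        n_val = int(len(items) * val)
        splits["train"] += [(item, label) for item in items[:n_train]]
        splits["val"] += [(item, label) for item in items[n_train:n_train + n_val]]
        splits["test"] += [(item, label) for item in items[n_train + n_val:]]
    return splits
```

Splitting within each class rather than over the pooled images keeps the class proportions similar across subsets, which matters when the per-category counts differ.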

Table 1
Quantitative breakdown: image count per custard apple disease category.

Table 2
Sample images of different custard apple diseases.

Table 3
Key image classification metrics for disease dataset.
IPTC stands for "International Press Telecommunications Council". IPTC data is a standardized metadata format developed specifically for media and news agencies. It includes image details such as title, description, and location.